The diffusion of NLP methods in marketing research

A systematic analysis

Olivier Caron

Paris Dauphine - PSL

Christophe Benavent

Paris Dauphine - PSL

February 13, 2024

Research context

NLP methods are increasingly used in marketing research

NLP methods convert text into quantifiable data, making in-depth analysis of large volumes of text possible.

They are particularly suitable for marketing concerns:

  • Evaluate customer feedback and gauge public sentiment
  • Detect emotional responses to products and marketing campaigns
  • Detect trends, consumer preferences and needs.

Specific research practices

The institutional framework promotes competition in academia, driven by journal rankings and citation scores, which influence researcher compensation and career advancement (Richard et al., 2015).

Research is concentrated among major publishers (Elsevier, Thomson), with a growing number of journals made easily accessible through bibliographic search interfaces.

What are the factors driving the diffusion of NLP methods? 1/2

Researchers must develop strategies to advance their careers (Kolesnikov et al., 2018)

  • Balance between productivity, impact, originality, legitimacy, learning costs.

These strategies can be thought of in terms of the technology acceptance model from Davis (1989)

  • NLP methods are adopted based on (1) perceived usefulness and (2) perceived ease of use

What are the factors driving the diffusion of NLP methods? 2/2

NLP methods are used in close communities of practice (Hauser et al., 2006; Lave & Wenger, 1991)

Management trends (Abrahamson, 1991)

Data presentation

Articles

  • There are 683 articles and 2315 different authors

  • Publication dates range from 1985 to 2023

Data collection

  • All data were collected from Scopus.

Keywords

  • “natural language processing”, “nlp”, “embeddings”, “chatgpt”, “liwc”, “transformers”, “word2vec”, “wordtovec”, “lda”, “text mining”, “text-mining”, “text analysis”, “text analytics”, “text-analytics”, “text-analysis”
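As an illustration, the keyword list can be assembled into a single boolean search string. This is a minimal sketch only: the slides do not state which Scopus fields were searched, so the `TITLE-ABS-KEY` field code used here is an assumption, as are the variable names.

```r
# Keywords as listed on the slide
keywords <- c(
  "natural language processing", "nlp", "embeddings", "chatgpt", "liwc",
  "transformers", "word2vec", "wordtovec", "lda", "text mining",
  "text-mining", "text analysis", "text analytics", "text-analytics",
  "text-analysis"
)

# Wrap each keyword in double quotes and join the terms with OR
quoted <- paste0('"', keywords, '"')

# Assumption: the search targets title, abstract and keywords
query <- paste0("TITLE-ABS-KEY(", paste(quoted, collapse = " OR "), ")")

cat(query, "\n")
```

The resulting string would still need to be combined with a restriction to the selected journals, which is not shown here.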

Journals

  • 156 journals ranked from the first to third quartile according to the SCImago Journal Rank in the field of marketing.

General data visualization

Production per affiliation

Citations per country

A focus on affiliations: number of productions

A focus on affiliations: number of citations

Management trend in publication volume

Networks & Communities

The structure of author networks (until 2015)

The structure of author networks (until 2023)

Some measures of network structure

The unequal spread of citations between communities

The top 5 communities account for 13.65% of authors (126) and 46% of citations (14851)
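Shares of this kind can be computed from a table with one row per community. The sketch below uses hypothetical toy numbers, not the study's data, and assumes that "top" communities are ranked by citations; the function name `top_share` is illustrative.

```r
# Hypothetical toy data: one row per community (NOT the study's data)
communities <- data.frame(
  community = 1:5,
  n_authors = c(5, 5, 30, 30, 30),
  citations = c(600, 300, 50, 30, 20)
)

# Share of authors and citations held by the k most-cited communities
top_share <- function(df, k) {
  ranked <- df[order(df$citations, decreasing = TRUE), ]
  top <- ranked[seq_len(k), ]
  c(
    author_share   = sum(top$n_authors) / sum(df$n_authors),
    citation_share = sum(top$citations) / sum(df$citations)
  )
}

shares <- top_share(communities, k = 2)
print(round(100 * shares, 2))
```

In this toy example, the two most-cited communities hold 10% of authors and 90% of citations, the same kind of imbalance the slide reports for the real network.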

Topics

Topic modeling with STM

A concentric diffusion of topics

A focus on methods

Main NLP techniques used

  • “0” or “1” depending on whether a technique is detected in an article

  • combined text of abstract, title, and keywords

  • detection of the technique name and the different spellings

# Alternative spellings of BERT to search for in the combined text
bert_alt <- c(
  "bert",
  "bi-directional encoder representation from transformer",
  "bi-directional encoder representation from transformers",
  "bi-directional encoder representations from transformer",
  "bi-directional encoder representations from transformers",
  "bidirectional encoder representation from transformer",
  "bidirectional encoder representation from transformers",
  "bidirectional encoder representations from transformer",
  "bidirectional encoder representations from transformers",
  "bi directional encoder representation from transformer",
  "bi directional encoder representation from transformers",
  "bi directional encoder representations from transformer",
  "bi directional encoder representations from transformers"
)
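The detection step described above could look roughly like the following. This is a sketch under stated assumptions, not the authors' actual code: the helper `detect_technique`, the toy `docs` data frame, and the use of word boundaries are all hypothetical choices, and the spellings are assumed to contain no regex metacharacters.

```r
# Hypothetical helper: 1 if any spelling occurs as a whole word/phrase, else 0
detect_technique <- function(text, spellings) {
  # Assumption: spellings contain only letters, spaces and hyphens
  pattern <- paste0("\\b(", paste(spellings, collapse = "|"), ")\\b")
  as.integer(grepl(pattern, tolower(text), perl = TRUE))
}

# Shortened spelling list for illustration
bert_spellings <- c(
  "bert",
  "bidirectional encoder representations from transformers",
  "bi-directional encoder representations from transformers"
)

# Hypothetical toy articles (NOT from the corpus)
docs <- data.frame(
  title    = c("Sentiment in reviews with BERT",
               "Liberty and brand love",
               "Mining complaints"),
  abstract = c("We fine-tune a language model.",
               "A survey of consumers using RoBERTa-free methods.",
               "We use bi-directional encoder representations from transformers."),
  keywords = c("reviews", "brand", "complaints")
)

# Combine title, abstract and keywords, then flag each article
combined <- paste(docs$title, docs$abstract, docs$keywords)
docs$bert <- detect_technique(combined, bert_spellings)
print(docs$bert)
```

The word boundaries keep "liberty" and "RoBERTa" from being counted as BERT. Whether the original analysis applied a similar safeguard is not stated in the slides.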

Evolution of NLP techniques

Technique            Count
Sentiment_Analysis      81
LDA                     53
LIWC                    26
ChatGPT                 20
Embeddings              19
Leximancer              13
BERT                     8
Transformers             7
STM                      6
Word2Vec                 6
LLM                      5
RoBERTa                  1
GPT2                     1
PARA                     1
PassivePy                1
NER                      0
TFIDF                    0
FastText                 0
TextRank                 0
GPT3                     0
POS_Tagging              0

Diffusion and delay of NLP techniques

There is a significant time lag between the emergence of techniques and their actual use in marketing research:

  • LIWC: 1993 -> 2014
  • LDA: 2003 -> 2010

which is decreasing over time:

  • Embeddings: 2013 -> 2017

  • BERT: 2018 -> 2021

  • ChatGPT: 2022 -> 2023
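The lags implied by the dates above can be computed directly. In this short sketch the column names are illustrative, and the second year of each pair is read as the first observed use in marketing research.

```r
# Years taken from the slide: emergence of the technique -> use in marketing research
adoption <- data.frame(
  technique = c("LIWC", "LDA", "Embeddings", "BERT", "ChatGPT"),
  emerged   = c(1993, 2003, 2013, 2018, 2022),
  first_use = c(2014, 2010, 2017, 2021, 2023)
)

# Adoption lag in years
adoption$lag_years <- adoption$first_use - adoption$emerged

# Order by year of emergence to make the shrinking lag visible
print(adoption[order(adoption$emerged), ])
```

Ordered by year of emergence, the lags are 21, 7, 4, 3 and 1 years.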

Dominantly marketing: the selective borrowing of NLP methods

To conclude:

  1. The adoption of NLP methods in top-tier marketing journals suggests that they are not a mere trend but an innovation gaining solid ground in the field.
  2. The cost of learning NLP is decreasing over time, facilitated by a growing repository of shared knowledge.
  3. The integration and diffusion of NLP techniques are bolstered within communities of practice.
  4. Diffusion is driven by new research questions, data availability, and suitable techniques.
  5. The delay between the introduction of a technique and its adoption is decreasing over time.

References

Abrahamson, E. (1991). Managerial fads and fashions: The diffusion and rejection of innovations. The Academy of Management Review, 16(3), 586. https://doi.org/10.2307/258919
Davis, F. D. (1989). Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology. MIS Quarterly, 13(3), 319. https://doi.org/10.2307/249008
Hauser, J., Tellis, G. J., & Griffin, A. (2006). Research on Innovation: A Review and Agenda for "Marketing Science". Marketing Science, 25(6), 687–717. http://www.jstor.org/stable/40057216
Kolesnikov, S., Fukumoto, E., & Bozeman, B. (2018). Researchers' risk-smoothing publication strategies: Is productivity the enemy of impact? Scientometrics, 116(3), 1995–2017. https://doi.org/10.1007/s11192-018-2793-8
Lave, J., & Wenger, E. (1991). Situated Learning: Legitimate Peripheral Participation (1st ed.). Cambridge University Press. https://doi.org/10.1017/CBO9780511815355
Richard, J., Plimmer, G., Fam, K.-S., & Campbell, C. (2015). Publishing success of marketing academics: antecedents and outcomes. European Journal of Marketing, 49(1/2), 123–145. https://doi.org/10.1108/EJM-06-2013-0311
Roberts, M. E., Stewart, B. M., Tingley, D., Lucas, C., Leder-Luis, J., Gadarian, S. K., Albertson, B., & Rand, D. G. (2014). Structural Topic Models for Open-Ended Survey Responses. American Journal of Political Science, 58(4), 1064–1082. https://doi.org/10.1111/ajps.12103

Appendix